Corpora and Translation: Uses and Future Prospects

Authors

  • Tony McEnery
  • Andrew Wilson
Abstract

Although corpora have been an object of study for some decades, the 1980s saw increased interest in their use and construction. With this interest and awareness has come an expansion of the application areas for which corpus-based approaches are deemed relevant. This paper seeks to define the concept of a corpus and to discuss its relevance to two application areas in particular: automatic and manual translation.

Similar resources

A new model for Persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated, with the hyphen separating the parts. Persian also has multi-part words. Under Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...

Extracting parallel corpora from comparable documents to improve translation quality in machine translation systems

Data used for training statistical machine translation methods are usually prepared from three resources: parallel, non-parallel and comparable text corpora. Parallel corpora are an ideal resource for translation, but because such texts are scarce, non-parallel and comparable corpora are also used for parallel text extraction. Most existing methods for exploiting comparable corpora loo...

Corpus-Centered Computation

To achieve translation technology that is adequate for speech-to-speech translation (S2S), this paper introduces a new approach named Corpus-Centered Computation (abbreviated to C³ and pronounced c-cube). As opposed to conventional approaches adopted by machine translation systems for written language, C³ places corpora at the center of the technology. For example, translation knowledge is extrac...

Small Hydro-Power Plants in Kenya: A Review of Status, Challenges and Future Prospects

Small Hydro-power Plants (SHP) are an important source of electricity in many countries. However, little is known about SHP in Kenya. This paper reviews the status, challenges in implementation of SHP and prospects for future development of SHP in Kenya. The paper shows that SHP has not yet fully utilized the available hydro-power potential. The challenges associated with SHP development should...

Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora

Statistical machine translation systems are usually trained on large amounts of bilingual text and monolingual text. In this paper, we propose a method to perform domain adaptation for statistical machine translation, where in-domain bilingual corpora do not exist. This method first uses out-of-domain corpora to train a baseline system and then uses in-domain translation dictionaries and in...

Publication date: 1993